Mermaid PR architecture-diff action (level-1 default, nested opt-in)#2
Merged
Conversation
…t-in)
Replaces the webview/Playwright PNG approach with an inline Mermaid diagram
that GitHub renders natively in the PR comment — no image, no orphan branch,
no contents:write, and fork-friendly.
How it works:
- Resolve a base ("before") analysis: use the committed
.codeboarding/analysis.json at the PR base if present, else generate one via
a full engine run on the base commit.
- Analyze the PR head incrementally, seeded from the base (stable component
ids), falling back to a full run on cache miss.
- scripts/diff_to_mermaid.py diffs the two analyses (name-based matching;
relation label change => modified) and emits a graph LR with nodes colored
via classDef/class and arrows via positional linkStyle: green added, yellow
modified, red dashed deleted. Escaping, deleted-namespace keying, and a size
guard (GitHub's ~500-edge / 50k-char cap -> changed-only or text fallback).
Rendering:
- Level 1 (flat, top-level) is the default — readable inline, never trips the
size cap.
- nested: true draws depth>1 sub-components as subgraphs (leaf nodes filled,
parent containers outlined). Optional --font-size/--node-padding/spacing emit
an %%{init}%% directive to enlarge nodes.
scripts/run_local.sh mirrors the action for local iteration (fast diff-only or
full pipeline) and writes a browser HTML preview rendered with mermaid.js.
Architecture review · 7 components changedgraph LR
n_Execution_Orchestrator["Execution Orchestrator"]
n_Structural_Analysis_Engine["Structural Analysis Engine"]
n_Diagram_Synthesis_Component["Diagram Synthesis Component"]
n_Interaction_CTA_Provider["Interaction #amp; CTA Provider"]
del_Environment_Context_Resolver["Environment #amp; Context Resolver"]
del_Remote_Job_Orchestrator["Remote Job Orchestrator"]
del_Artifact_Result_Processor["Artifact #amp; Result Processor"]
n_Execution_Orchestrator -- "Passes snapshots to" --> n_Structural_Analysis_Engine
n_Structural_Analysis_Engine -- "Provides diffs to" --> n_Diagram_Synthesis_Component
n_Structural_Analysis_Engine -- "Supplies change data to" --> n_Interaction_CTA_Provider
n_Diagram_Synthesis_Component -- "Integrates links into" --> n_Interaction_CTA_Provider
n_Diagram_Synthesis_Component -- "calls" --> n_Structural_Analysis_Engine
del_Environment_Context_Resolver -- "passes resolved repository URL and branch to in…" --> del_Remote_Job_Orchestrator
del_Remote_Job_Orchestrator -- "hands off JSON response containing full job res…" --> del_Artifact_Result_Processor
del_Artifact_Result_Processor -- "writes extracted documentation data to runner f…" --> del_Artifact_Result_Processor
del_Remote_Job_Orchestrator -- "manages job configuration and polls for status…" --> del_Remote_Job_Orchestrator
classDef added fill:#1f883d,stroke:#0b5d23,color:#ffffff;
classDef modified fill:#bf8700,stroke:#7d4e00,color:#ffffff;
classDef deleted fill:#cf222e,stroke:#82071e,color:#ffffff,stroke-dasharray:5 3;
class n_Execution_Orchestrator,n_Structural_Analysis_Engine,n_Diagram_Synthesis_Component,n_Interaction_CTA_Provider added;
class del_Environment_Context_Resolver,del_Remote_Job_Orchestrator,del_Artifact_Result_Processor deleted;
linkStyle 0,1,2,3,4 stroke:#0b5d23,stroke-width:2px;
linkStyle 5,6,7,8 stroke:#82071e,stroke-width:2px,stroke-dasharray:5 3;
Colours indicate components that have been 🟩 added · 🟨 modified · 🟥 removed — versus 🧭 See this architecture in your editor: Open in VS Code → 💡 New to CodeBoarding? Get the extension → codeboarding-action · run 26942517446 |
…ace/extension CTA via click proxy
…n editor' CTA Move CTA building into scripts/build_cta.py (testable; docstring cites the 2025 SO survey: VS Code 75.9% + Cursor 17.9% ~= 94% justifies VS Code+Cursor only). .vscode dir -> VS Code, .cursor -> Cursor, both -> both, neither -> VS Code.
depth_level drives the engine (deep, rich data); new render_depth controls how many component levels the PR Mermaid draws (default 1 = top-level flat). Replaces the nested boolean. Self-test now runs depth_level=2 + render_depth=1.
…rl), unit tests + CI - engine_ref default main -> v0.12.0 (verified to have the run_full/run_incremental/ health/StaticAnalysisCache API); reproducible default, overridable. - README: engine_ref + cta_base_url rows; depth_level vs render_depth clarified. - tests/: 21 stdlib unittest cases for diff_to_mermaid + build_cta. - .github/workflows/test.yml runs them on push/PR (no LLM).
…cache The manual .venv cache was restored then immediately wiped by 'uv venv --clear', so it never helped. Reuse it when present and enable uv's download cache for cold installs.
Guard now resolves the PR from either a pull_request event or an issue_comment '/codeboarding' command (comment body read from env, never interpolated -> no injection; SHAs fetched via gh api). Reacts 👀 to acknowledge. Configurable via trigger_command. Sticky comment + base_ref now use the resolved PR explicitly.
…e); CTA -> extension-direct - docs/COMMIT_STRATEGY.md: commit analysis.json + health_report.json (text, display-critical); do NOT commit static_analysis.pkl (binary -> actions/cache). Forward-compatible with a future hosted webview. - build_cta: drop the 'explore in browser' webview tier (deferred); footer now drives straight to the extension (open in VS Code/Cursor + get the extension).
…ing)
Security
- CRITICAL: gate the /codeboarding (issue_comment) path on author_association
(OWNER/MEMBER/COLLABORATOR) before any checkout/analysis — closes the
pwn-request/secret-exfiltration hole; README example + security note updated.
- Pass base_ref/header/cta_base via env (not interpolated into bash); extract
engine python -c into scripts/cb_engine.py (no ${{ }}->python-source interp);
preflight prints only the error message, not the whole auth body.
Bugs (diff_to_mermaid.py)
- Escape Mermaid shape metacharacters (] [ ( ) { } | < > &) — a bare ] no longer
breaks GitHub rendering.
- n_changed is now recursive + a `changed` flag covers relation/nested-only
changes; the action shows the diagram on `rendered` instead of suppressing it.
- Size guard counts drawn edges and RETRIES changed-only before giving up.
- _filter_changed ignores empty id/name strings.
- Deleted parent keeps its subtree (renders as a deleted subgraph, symmetric
with added).
action.yml hardening
- Guard: set -uo pipefail + graceful gh-api failure + assert base/head SHAs.
- Generate-base: worktree remove/prune (not just rm -rf); cache the generated
base keyed by base SHA (pay the cold full-analysis once).
- Seed static_analysis.sha with the pkl (enables the committed-baseline
warm-start that was previously unreachable).
- Health step writes 0 first (no stale count on exception).
- Post a failure comment (if: failure()) so a failed run isn't silent.
Workflows
- release-major-tag: anchor the semver regex ($) so prereleases don't move vN.
- test-self + README example: concurrency cancel-in-progress.
Tests: +cb_engine smoke tests, escaping/changed-flag cases (31 total, all green).
Member
Author
|
/codeboarding |
The cb-base cache stores a full LLM analysis whose output depends on agent_model/parsing_model, but the key omitted them: re-running on the same base SHA with a different model served a stale-model 'before' snapshot, producing phantom added/modified/deleted components. Add both models to the restore+save keys. Drop the cb-uv restore-keys fallback: on a lockfile change it restored a venv built from a different uv.lock, and 'uv pip install -e .' does not reconcile already-installed transitive deps back to the lock, silently running mismatched versions. Exact-key hits (unchanged lock) still warm-start.
…check _diff_methods built full per-file sorted added/removed dicts, but only the boolean 'any change?' answer was used — the method_diff payload was attached to every component and never read by the renderer or any test. Replace it with a short-circuiting _has_method_changes() (no sorting, no payload), and fold _has_structural_changes' if/return ladder into one boolean return. Behaviour-identical; 41 tests green.
…able fallback Lock down three untested renderer/engine paths: _truncate's 48-char ellipsis boundary, _esc stripping raw newlines/CR (a raw newline breaks the whole Mermaid block), and run_head falling back to a full analysis on BaselineUnavailableError (only IncrementalCacheMissingError was covered).
Lead with a rendered sample Mermaid diagram so readers see the actual colored output (a visual tool should show, not tell). Drop the marketing lead line, and fold the overlapping 'when it runs', 'on-demand', and trailing security paragraphs into one section that states each fact (auto-once vs /codeboarding, default-branch rule, fork refusal) exactly once.
The agent_model/parsing_model defaults were a hardcoded 'openrouter/anthropic/ claude-sonnet-4' that OpenRouter rejects (400 invalid model id — the engine sends the slug verbatim, so it must be a bare OpenRouter id like anthropic/claude-sonnet-4, no 'openrouter/' prefix). Default both to empty and export AGENT_MODEL/PARSING_MODEL only when set, so an unset model defers to the engine's own valid per-provider default instead of a string that rots over time. Users override via the inputs (optionally wired to repo secrets).
State where the OPENROUTER_API_KEY secret goes (and the env-var path for local runs), and that agent_model/parsing_model are optional — omit for the engine's per-provider default, or pin to a bare OpenRouter slug (inline or via secret).
…' slug run_local.sh still defaulted AGENT_MODEL/PARSING_MODEL to openrouter/anthropic/ claude-sonnet-4 — the litellm-prefixed string that 400s against the engine's OpenRouter ChatOpenAI path (the bug just fixed in action.yml). Default both to empty and export only when set, so the local harness mirrors the action and an unset model falls through to the engine's own valid per-provider default.
…at rule A model name is non-sensitive config, so steer overrides to a repo variable (vars.AGENT_MODEL) and keep secrets. for the key. State the format rule once: bare OpenRouter slug, exactly one '/', no 'openrouter/' prefix.
The engine calls OpenRouter natively (langchain ChatOpenAI), so the model must be a bare slug; the litellm 'openrouter/...' form 400s deep inside the engine. Fail fast in the verify step with a one-line actionable error instead, the same early-validation the API key already gets.
The diff step expanded "${CHANGED_ONLY[@]}" etc. directly; on macOS's bash
3.2 an empty array under 'set -u' errors as 'unbound variable', so the full
local pipeline crashed at diff->mermaid whenever the optional flags weren't
passed. Use the ${arr[@]+"${arr[@]}"} idiom (elements when set, nothing when
empty) so it works on bash 3.2 and 4+.
Add a 'Bring your own LLM provider' section: lead with the <NAME>_API_KEY convention + one example, collapse the full provider table, and point at the engine's registry (agents/llm_config.py) as the source of truth so the README can't rot. Scope the 'bare OpenRouter slug' model-format note to OpenRouter.
The action hardcoded OPENROUTER_API_KEY + an OpenRouter-only preflight, so a user with an OpenAI/Anthropic/etc. key couldn't use it. Add an optional 'llm_provider' input that maps the key to the engine's env var by the <NAME>_API_KEY convention (with aws_bedrock/ollama exceptions); the engine then auto-selects the provider. Gate the OpenRouter preflight and the 'openrouter/' model-prefix guard behind provider==openrouter, export only the selected env var in both engine steps, and add llm_provider to the base-cache key. Default stays openrouter, so existing workflows are unchanged. No engine change.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Replaces the webview/Playwright PNG approach with an inline Mermaid diagram GitHub renders natively in the PR comment.
.codeboarding/analysis.jsonif present, else a full engine run on the base commit; head analyzed incrementally from it.scripts/diff_to_mermaid.pydiffs the two analyses and emits a coloredgraph LR(green added / yellow modified / red deleted, nodes + arrows).nested: truefor depth>1 subgraphs.scripts/run_local.shfor local iteration.The
test-self.ymlworkflow runs this action on this PR as a live test (remove before merge). This PR comment thread should get an architecture-diff comment shortly.